Evolving Multi-Resolution Pooling CNN for Monaural Singing Voice Separation

Abstract

Monaural singing voice separation (MSVS) is a challenging task and has been extensively studied. Deep neural networks (DNNs) are the current state-of-the-art methods for MSVS. However, they are often designed manually, which is time-consuming and error-prone. They are also pre-defined, and thus cannot adapt their structures to the training data. To address these issues, we first designed a multi-resolution convolutional neural network (CNN) for MSVS called multi-resolution pooling CNN (MRP-CNN), which uses various-sized pooling operators to extract multi-resolution features. We then introduced Neural Architecture Search (NAS) to extend the MRP-CNN to the evolving MRP-CNN (E-MRP-CNN), which automatically searches for effective MRP-CNN structures using genetic algorithms optimized in terms of a single objective, taking into account only separation performance, and multiple objectives, taking into account both separation performance and model complexity. The E-MRP-CNN using the multi-objective algorithm gives a set of Pareto-optimal solutions, each providing a trade-off between separation performance and model complexity. Evaluations on the MIR-1K, DSD100, and MUSDB18 datasets were used to demonstrate the advantages of the E-MRP-CNN over several recent baselines.
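The notion of a Pareto-optimal set mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' search procedure: the abstract does not say which multi-objective genetic algorithm is used, and the candidate names, SDR values, and parameter counts below are hypothetical. The sketch only shows the dominance test that decides which candidates offer a genuine trade-off between separation performance (higher is better) and model complexity (lower is better).

```python
from typing import List, Tuple

# Hypothetical candidate: (name, separation score in dB, number of parameters).
# All values are invented for illustration; none come from the paper.
Candidate = Tuple[str, float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if `a` is no worse than `b` on both objectives
    (higher score, fewer parameters) and strictly better on at least one."""
    no_worse = a[1] >= b[1] and a[2] <= b[2]
    strictly_better = a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


candidates: List[Candidate] = [
    ("net_a", 6.0, 500_000),
    ("net_b", 6.5, 900_000),
    ("net_c", 5.8, 700_000),    # dominated by net_a (lower score, more parameters)
    ("net_d", 7.0, 2_000_000),
    ("net_e", 6.5, 1_200_000),  # dominated by net_b (same score, more parameters)
]

front = pareto_front(candidates)
front_names = [name for name, _, _ in front]
print(front_names)
```

Each surviving candidate is better than every other survivor on one objective and worse on the other, which is the kind of trade-off the abstract describes.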

Related resources

Singing Voice Separation from Monaural Recordings

Separating singing voice from music accompaniment has wide applications in areas such as automatic lyrics recognition and alignment, singer identification, and music information retrieval. Compared to the extensive studies of speech separation, singing voice separation has been little explored. We propose a system to separate singing voice from music accompaniment from monaural recordings. The ...

Bayesian Singing-Voice Separation

This paper presents a Bayesian nonnegative matrix factorization (NMF) approach to extract singing voice from background music accompaniment. Using this approach, the likelihood function based on NMF is represented by a Poisson distribution and the NMF parameters, consisting of basis and weight matrices, are characterized by the exponential priors. A variational Bayesian expectation-maximization ...

Deep Clustering for Singing Voice Separation

This extended abstract describes the system we submitted for the singing voice separation task of MIREX 2016. Our submission here is an extension of the deep clustering network from [1].

Real-time Online Singing Voice Separation from Monaural Recordings Using Robust Low-rank Modeling

Separating the leading vocals from the musical accompaniment is a challenging task that arises naturally in several music processing applications. Robust principal component analysis (RPCA) has recently been applied to this problem, producing very successful results. The method decomposes the signal into a low-rank component corresponding to the accompaniment with its repetitive structure, and...

Monaural Singing Voice Separation with Skip-Filtering Connections and Recurrent Inference of Time-Frequency Mask

Singing voice separation based on deep learning relies on time-frequency masking. In many cases the masking process is not a learnable function or is not encapsulated into the deep learning optimization. Consequently, most of the existing methods rely on a post-processing step using generalized Wiener filtering. This work proposes a method that learns and optimizes (during trai...

Journal

Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Year: 2021

ISSN: 2329-9304, 2329-9290

DOI: https://doi.org/10.1109/taslp.2021.3051331